Abstract:
Computational biology, a phrase coined by analogy with computing's significance in the physical sciences, is resurfacing as a critical component of current biological and medical specialty research. From guiding the direction of experimental investigations to giving data and insight that cannot be achieved any other way, scientific discipline and process science are critical tools for next-generation biological endeavors. On the other hand, process approaches can provide an underpinning for the mixing of broad disciplines for the development of a quantitative systems approach to understanding the mechanisms within the lifetime of the cell, as reflected in the successes of the ordination project and driven by the ability of biology techniques.
Abstract:
Significant technical advances in imaging, molecular biology and genomics have fueled a revolution in cell biology, in that the molecular and structural processes of the cell are now visualized and measured routinely. Driving much of this recent development has been the advent of computational tools for the acquisition, visualization, analysis and dissemination of these datasets. These tools collectively make up a new subfield of computational biology called bioimage informatics, which is facilitated by open source approaches. We discuss why open source tools for image informatics in cell biology are needed, describe some of the key general attributes that make an open source imaging application successful, and point to opportunities for further operability that should greatly accelerate future cell biology discovery.
Abstract:
Binary repeated measures data are commonly encountered in both experimental and observational veterinary studies. Among the wide range of statistical methods and software applicable to such data one major distinction is between marginal and random effects procedures. The objective of the study was to review and assess the performance of marginal and random effects estimation procedures for the analysis of binary repeated measures data. Two simulation studies were carried out, using relatively small, balanced, two-level (time within subjects) datasets. The first study was based on data generated from a marginal model with first order autocorrelation, the second on a random effects model with autocorrelated random effects within subjects. Three versions of the models were considered in which a dichotomous treatment was modelled additively, either between or within subjects, or modelled by a time interaction. Among the studied statistical procedures were: generalized estimating equations (GEE), Marginal Quasi Likelihood, likelihood based on numerical integration, penalized quasi-likelihood, restricted pseudo likelihood and Bayesian Markov Chain Monte Carlo. Results for data generated by the marginal model showed autoregressive GEE to be highly efficient when treatment was within subjects, even with strongly correlated responses. For treatment between subjects, random effects procedures also performed well in some situations; however, a relatively small number of subjects with a short time series proved a challenge for both marginal and random effects procedures. Results for data generated by the random effects model showed bias in estimates from random effects procedures when autocorrelation was present in the data, while the marginal procedures generally gave estimates close to the marginal parameters.
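As a minimal illustration of the first-order autocorrelation structure mentioned above, the sketch below builds an AR(1) working correlation matrix of the kind an autoregressive GEE assumes, in which the correlation between measurements at times i and j is rho^|i-j|. The function name `ar1_correlation` and the value rho = 0.5 are assumptions chosen for illustration and are not taken from the study.

```python
def ar1_correlation(n_times, rho):
    """Return the n_times x n_times AR(1) correlation matrix as nested lists.

    Entry (i, j) equals rho ** |i - j|, so correlation decays with time lag.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]


if __name__ == "__main__":
    # Hypothetical example: four repeated measurements per subject, rho = 0.5.
    for row in ar1_correlation(4, 0.5):
        print(row)
```

A single parameter rho describes the whole within-subject correlation pattern here, which is one reason such a structure is attractive for short time series.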
Abstract:
In this paper we report a successful application of machine learning approaches to the prediction of chemical carcinogenicity. Two different approaches, namely a support vector machine (SVM) and an artificial neural network (ANN), were evaluated for predicting chemical carcinogenicity from molecular structure descriptors. A diverse set of 844 compounds, including 600 carcinogenic (CG+) and 244 noncarcinogenic (CG-) molecules, was used to estimate the accuracies of these approaches. The database was divided into two sets: the model construction set and the independent test set. Relevant molecular descriptors were selected by a hybrid feature selection method combining Fisher's score and Monte Carlo simulated annealing from a wide set of molecular descriptors, including physicochemical properties and constitutional, topological, and geometrical descriptors. The first model validation method was based on five-fold cross-validation, in which the model construction set is split into five subsets. The five-fold cross-validation was used to select descriptors and optimise the model parameters by maximising the averaged overall accuracy. The final SVM model gave an averaged prediction accuracy of 90.7% for CG+ compounds, 81.6% for CG- compounds and 88.1% overall, while the corresponding ANN model provided an averaged prediction accuracy of 86.1% for CG+ compounds, 79.3% for CG- compounds and 84.2% overall. These results indicate that the hybrid feature selection method is very efficient and the selected descriptors are truly relevant to the carcinogenicity of compounds. The second model validation method was a hold-out method, in which the whole model construction set was used to build the classification model with the selected descriptors and the optimised model parameters, and the independent test set was used to test the predictive ability of the model.
The SVM model gave a prediction accuracy of 87.6% for CG+ compounds, 79.1% for CG- compounds and 85.0% for the overall accuracy. The ANN model gave a prediction accuracy of 85.6% for CG+ compounds, 79.1% for CG- compounds and 83.6% for the overall accuracy. The results indicate that the built models are potentially useful for facilitating the prediction of chemical carcinogenicity of untested compounds.
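The five-fold cross-validation described above can be sketched as follows. This is only an illustrative index-splitting routine under stated assumptions: the function name `five_fold_indices`, the shuffling seed, and the unstratified split are hypothetical choices, and the abstract does not say how the authors formed their five subsets.

```python
import random


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation.

    Sample indices are shuffled once, dealt into n_folds subsets, and each
    subset serves as the validation set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        validation = folds[k]
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((training, validation))
    return splits


if __name__ == "__main__":
    # Hypothetical construction set of 23 compounds.
    for training, validation in five_fold_indices(23):
        print(len(training), len(validation))
```

Averaging a model's accuracy over the five validation subsets gives the kind of "averaged prediction accuracy" the abstract reports for the cross-validation stage.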
Abstract:
About 20%-30% of genome products have been predicted to be membrane proteins, which have significant biological functions. Predicting the number and position of transmembrane helical segments (TMHs) in these proteins is a hot spot in bioinformatics. In this paper, a new approach, maximum spectrum of continuous wavelet transform (MSCWT), is proposed to predict TMHs. The predictions for eight SARS-CoV membrane proteins indicate that MSCWT performs comparably to the TMpred software. Moreover, a test on a dataset of 131 proteins of known structure with 548 TMHs shows that the prediction accuracy of MSCWT is 91.6% for TMHs and 89.3% for membrane proteins.
Abstract:
This note describes software for fitting Bayesian genetically structured variance models using Markov chain Monte Carlo (MCMC). The gsevm v.2 program was written in Fortran 90. The DOS and Unix executable programs, the user's guide, and some example files are freely available for research purposes at http://www.bdporc.irta.es/estudis.jsp. The main feature of the program is to compute Monte Carlo estimates of marginal posterior distributions of parameters of interest. The program is quite flexible, allowing the user to fit a variety of linear models at the level of the mean and the log-variance.
Abstract:
Motivation: Gillespie's stochastic simulation algorithm (SSA) is often the most tractable method to study stochastic models of biochemical systems. The algorithm itself is very simple and a natural target for implementation on specialized architectures such as the Cell Broadband Engine (Cell/BE). We have developed CellMC, a multiplatform SBML model compiler implementing a vectorized version of SSA for use on Cell/BE or x86 PCs.
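To show how simple the algorithm is, here is a minimal scalar sketch of Gillespie's direct method in Python. It is not CellMC's vectorized implementation; the function name `gillespie_ssa`, its argument layout, and the single-species decay model in the usage example are assumptions made for illustration.

```python
import math
import random


def gillespie_ssa(state, stoichiometry, propensity_fns, t_end, rng):
    """Simulate one trajectory with Gillespie's direct method.

    state: initial molecule counts, one entry per species.
    stoichiometry: per-reaction list of count changes for each species.
    propensity_fns: per-reaction functions mapping state to a rate.
    Returns a list of (time, state) tuples, starting at time 0.
    """
    t = 0.0
    state = list(state)
    trajectory = [(t, tuple(state))]
    while True:
        propensities = [f(state) for f in propensity_fns]
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        for index, a in enumerate(propensities):
            cumulative += a
            if threshold < cumulative:
                break
        for species, change in enumerate(stoichiometry[index]):
            state[species] += change
        trajectory.append((t, tuple(state)))
    return trajectory


if __name__ == "__main__":
    # Hypothetical model: first-order decay A -> (nothing) with rate 0.5 * A.
    path = gillespie_ssa([10], [[-1]], [lambda s: 0.5 * s[0]], 100.0,
                         random.Random(1))
    print(len(path), path[-1])
```

Each iteration needs only two random numbers and a sum over propensities, so many independent trajectories can be advanced side by side, which is the property a vectorized implementation exploits.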
Abstract:
Phosphorus (P) mobility in soils is controlled by its interaction with the soil matrix, nutrients, and amendments. The aim of this study was to test the hypothesis that soils with different pH differ in their ability to bind P as influenced by the presence of the common cations and anions that are added to soil with fertilizers. Therefore, this study was conducted to investigate the effect of common fertilizer ions on phosphorus sorption and lability characteristics in three soils differing in their pH. Phosphorus sorption isotherms were assessed on an acid, a neutral, and an alkaline alfisol in background solutions containing one of the ions K+, NH4+, Ca2+, NO3-, HCO3-, or SO42-. The Freundlich equation was used to describe the sorption. In addition, distribution coefficient values were obtained for each soil and background electrolyte. Lability of the sorbed P was evaluated by NaHCO3 extraction after its sorption. The Freundlich equation fitted the sorption data closely. The alkaline soil exhibited greater P sorption than the acid and neutral soils. Both K+ and NH4+ equally decreased P sorption as opposed to Ca2+, with more P in the labile form in all the soils studied. Phosphorus sorption was enhanced by HCO3- compared with SO42- and NO3- at all pH levels. Both NO3- and SO42- increased the labile P compared with HCO3-. Sulfate addition, however, resulted in more P in the labile pool compared with NO3- in acid and neutral soils, whereas NO3- addition resulted in the highest amount of the labile P form in the alkaline soil. These results have important implications for P management in relation to other nutrients.
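For readers unfamiliar with the Freundlich equation, q = K_f * C^(1/n), the sketch below shows one common way to fit it: ordinary least squares on the log-transformed data, since log q = log K_f + (1/n) log C. The abstract does not state how the authors fitted the isotherm, so the method, the function name `fit_freundlich`, and the synthetic data points are all assumptions for illustration.

```python
import math


def fit_freundlich(concentrations, sorbed):
    """Fit q = K_f * C**exponent by linear regression in log-log space.

    Returns (K_f, exponent), where exponent corresponds to 1/n.
    """
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(q) for q in sorbed]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** intercept, slope


if __name__ == "__main__":
    # Synthetic, noise-free data generated from K_f = 50 and 1/n = 0.4.
    conc = [0.5, 1.0, 2.0, 5.0, 10.0]
    sorb = [50.0 * c ** 0.4 for c in conc]
    k_f, exponent = fit_freundlich(conc, sorb)
    print(round(k_f, 3), round(exponent, 3))
```

A larger K_f indicates greater sorption at a given solution concentration, which is the kind of comparison the abstract draws between the alkaline soil and the acid and neutral soils.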